Copernicus: A Scalable, High-Performance Semantic File System

Authors

  • Andrew W. Leung
  • Aleatha Parker-Wood
  • Ethan L. Miller
Abstract

Hierarchical file systems do not effectively meet the needs of users at the petabyte scale. Users need dynamic, search-based file access to manage and use their growing sea of data. This paper presents the design of Copernicus, a new scalable, semantic file system that provides a searchable namespace for billions of files. Instead of augmenting a traditional file system with a search index, Copernicus uses a dynamic, graph-based file system design that indexes file attributes and relationships to provide scalable search and navigation of files.
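
The idea of a namespace built from indexed attributes and relationships, rather than from a directory tree, can be illustrated with a small sketch. This is not the Copernicus design or its API, which the abstract does not specify. It is a toy in-memory model, and every name in it (ToyNamespace, add_file, relate, search, neighbors) is hypothetical. It assumes attributes are plain key/value pairs and that relationships are labeled, directed edges between files.

```python
from collections import defaultdict


class ToyNamespace:
    """Toy in-memory namespace: files are graph nodes, not tree entries.

    Hypothetical illustration only; not the Copernicus implementation.
    """

    def __init__(self):
        self.attrs = {}                  # file id -> {attribute: value}
        self.index = defaultdict(set)    # (attribute, value) -> file ids
        self.edges = defaultdict(set)    # (file id, relation) -> file ids

    def add_file(self, file_id, **attributes):
        """Register a file and index each of its attribute/value pairs."""
        self.attrs[file_id] = dict(attributes)
        for item in attributes.items():
            self.index[item].add(file_id)

    def relate(self, src, relation, dst):
        """Record a labeled, directed relationship between two files."""
        self.edges[(src, relation)].add(dst)

    def search(self, **query):
        """Return ids of files matching every attribute=value pair."""
        if not query:
            return set(self.attrs)
        postings = [self.index.get(item, set()) for item in query.items()]
        return set.intersection(*postings)

    def neighbors(self, file_id, relation):
        """Navigate from a file along one kind of relationship."""
        return set(self.edges.get((file_id, relation), set()))


ns = ToyNamespace()
ns.add_file("run1.dat", owner="alice", type="data")
ns.add_file("run1.png", owner="alice", type="plot")
ns.add_file("notes.txt", owner="bob", type="text")
ns.relate("run1.png", "derived_from", "run1.dat")

plots_by_alice = ns.search(owner="alice", type="plot")
sources = ns.neighbors("run1.png", "derived_from")
```

Here a query names attribute values instead of a path, and navigation follows relationship edges instead of parent directories. A system at the scale the abstract describes would also need partitioning and on-disk index structures, which this sketch leaves out.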

Similar resources

A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model

Information extraction (IE) is the process of automatically producing a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) that has been intensified by the growing volume, heterogeneity, and unstructured form of information. One of the core information extraction tasks is relation extraction wh...


The Edge Node File System: A Distributed File System for High Performance Computing

The concept of using Internet edge nodes for High Performance Computing (HPC) applications has gained acceptance in recent times. Many of these HPC applications also have large I/O requirements. Consequently, an edge node file system that efficiently manages the large number of files involved can assist in improving application performance significantly. In this paper, we discuss the design of ...


Scalable Performance of the Panasas Parallel File System

The Panasas file system uses parallel and redundant access to object storage devices (OSDs), per-file RAID, distributed metadata management, consistent client caching, file locking services, and internal cluster management to provide a scalable, fault-tolerant, high-performance distributed file system. The clustered design of the storage system and the use of client-driven RAID provide scalable ...


CalvinFS: Consistent WAN Replication and Scalable Metadata Management for Distributed File Systems

Existing file systems, even the most scalable systems that store hundreds of petabytes (or more) of data across thousands of machines, store file metadata on a single server or via a shared-disk architecture in order to ensure consistency and validity of the metadata. This paper describes a completely different approach for the design of replicated, scalable file systems, which leverages a high...


PSON: A scalable P2P file sharing system with efficient complex query support

A desirable P2P file sharing system is expected to achieve the following design goals: scalability, routing efficiency, and complex query support. In this paper, we propose a powerful P2P file sharing system, PSON, which satisfies all three of these properties. PSON is essentially a semantic overlay network of logical nodes. Each logical node represents a cluster of peers that are close to ea...




Publication date: 2009